AITopics

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.64)

Mikio Aoi, Jonathan W. Pillow

Model-based targeted dimensionality reduction for neuronal population data

Neural Information Processing Systems, Feb-13-2026, 14:35:29 GMT

Neural Information Processing Systems http://nips.cc/

dimensionality, subspace, task variable, (14 more...)

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > United States > New York (0.04)
North America > Canada > Quebec > Montreal (0.04)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.42)

Mikio Aoi, Jonathan W. Pillow

Model-based targeted dimensionality reduction for neuronal population data

Neural Information Processing Systems, Nov-20-2025, 18:03:55 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, task variable, (16 more...)

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > United States > New York (0.04)
North America > Canada > Quebec > Montreal (0.04)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.42)

Bicher, Martin, Viehauser, Maximilian, Giannandrea, Daniele, Kastinger, Hannah, Brunmeir, Dominik, Rippinger, Claire, Urach, Christoph, Popper, Niki

GEPOC Parameters -- Open Source Parametrisation and Validation for Austria, Version 2.0

arXiv.org Artificial Intelligence, Nov-4-2025

GEPOC, short for Generic Population Concept, is a collection of models and methods for analysing population-level research questions. Applying the models validly to a specific country or region requires stable and reproducible data processes that provide valid, ready-to-use model parameters. This work contains a complete description of the data-processing methods for computing model parameters for Austria, based exclusively on freely and publicly accessible data. In addition to a description of the source data used, this includes all algorithms used for aggregation, disaggregation, fusion, cleansing or scaling of the data, as well as a description of the resulting parameter files. The document places particular emphasis on the computation of parameters for the most important GEPOC model, GEPOC ABM, a continuous-time agent-based population model. An extensive validation study using this model was conducted and is presented at the end of this work.

artificial intelligence, federalstate, source 5, (12 more...)

2511.00048

Country: Europe > Austria > Vienna (0.15)

Genre: Research Report > New Finding (0.47)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Health & Medicine (1.00)
Government > Regional Government (1.00)
Government > Immigration & Customs (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.34)

Tao, Jiashu, Shokri, Reza

(Token-Level) InfoRMIA: Stronger Membership Inference and Memorization Assessment for LLMs

arXiv.org Artificial Intelligence, Oct-10-2025

Machine learning models are known to leak sensitive information, as they inevitably memorize (parts of) their training data. More alarmingly, large language models (LLMs) are now trained on nearly all available data, which amplifies the magnitude of information leakage and raises serious privacy risks. Hence, it is more crucial than ever to quantify privacy risk before the release of LLMs. The standard method to quantify privacy is via membership inference attacks, where the state-of-the-art approach is the Robust Membership Inference Attack (RMIA). In this paper, we present InfoRMIA, a principled information-theoretic formulation of membership inference. Our method consistently outperforms RMIA across benchmarks while also offering improved computational efficiency. In the second part of the paper, we identify the limitations of treating sequence-level membership inference as the gold standard for measuring leakage. We propose a new perspective for studying membership and memorization in LLMs: token-level signals and analyses. We show that a simple token-based InfoRMIA can pinpoint which tokens are memorized within generated outputs, thereby localizing leakage from the sequence level down to individual tokens, while achieving stronger sequence-level inference power on LLMs. This new scope rethinks privacy in LLMs and can lead to more targeted mitigation, such as exact unlearning.

large language model, machine learning, natural language, (17 more...)

2510.05582

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (0.62)

Neural Information Processing Systems, Feb-7-2025, 17:07:10 GMT

Export Reviews, Discussions, Author Feedback and Meta-Reviews

High-dimensional neural spike train analysis with generalized count linear dynamical systems

This paper describes a general exponential-family model (called the "generalized count" (GC) distribution) for multi-neuron spike count data. The model accounts for both under-dispersed and over-dispersed spike count data, and has Poisson, Negative Binomial, Bernoulli, and several other classic models as special cases. The authors give a clear account of the relationship to other models, and demonstrate the need for a model to capture under-dispersed counts in primate motor cortex. They then describe an efficient method for maximum-likelihood fitting (and demonstrate concavity of the log-likelihood). They derive an efficient variational Bayesian inference method and apply the model to data from primate motor cortex, showing that it accounts more accurately for variance and cross-covariance of spike count data, compared to a model with Poisson observations.

author feedback and meta-review, export review, spike count data, (6 more...)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.58)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.57)

Qiu, Hongxiang, Tchetgen Tchetgen, Eric, Dobriban, Edgar

Efficient and Multiply Robust Risk Estimation under General Forms of Dataset Shift

arXiv.org Machine Learning, Jun-29-2023

Statistical machine learning methods often face the challenge of limited data available from the population of interest. One remedy is to leverage data from auxiliary source populations, which share some conditional distributions or are linked in other ways with the target domain. Techniques leveraging such dataset shift conditions are known as domain adaptation or transfer learning. Despite extensive literature on dataset shift, few works address how to efficiently use the auxiliary populations to improve the accuracy of risk evaluation for a given machine learning task in the target population. In this paper, we study the general problem of efficiently estimating target population risk under various dataset shift conditions, leveraging semiparametric efficiency theory. We consider a general class of dataset shift conditions, which includes three popular conditions (covariate, label and concept shift) as special cases. We allow for partially non-overlapping support between the source and target populations. We develop efficient and multiply robust estimators along with a straightforward specification test of these dataset shift conditions. We also derive efficiency bounds for two other dataset shift conditions, posterior drift and location-scale shift. Simulation studies support the efficiency gains due to leveraging plausible dataset shift conditions.

artificial intelligence, condition ds, machine learning, (16 more...)

2306.16406

Country:

North America > United States > New York > New York County > New York City (0.14)
Africa > South Africa (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
(8 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry:

Health & Medicine > Therapeutic Area > Immunology (0.68)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.92)

arXiv.org Artificial Intelligence, May-4-2023

Rethinking Population-assisted Off-policy Reinforcement Learning

Zheng, Bowen, Cheng, Ran

While off-policy reinforcement learning (RL) algorithms are sample efficient due to gradient-based updates and data reuse in the replay buffer, they tend to converge to local optima due to limited exploration. On the other hand, population-based algorithms offer a natural exploration strategy, but their heuristic black-box operators are inefficient. Recent algorithms have integrated these two methods, connecting them through a shared replay buffer. However, the effect of using diverse data from population optimization iterations on off-policy RL algorithms has not been thoroughly investigated. In this paper, we first analyze the use of off-policy RL algorithms in combination with population-based algorithms, showing that the use of population data could introduce an overlooked error and harm performance. To test this, we propose a uniform and scalable training design and conduct experiments on our tailored framework in robot locomotion tasks from the OpenAI gym. Our results substantiate that using population data in off-policy RL can cause instability during training and even degrade performance. To remedy this issue, we further propose a double replay buffer design that provides more on-policy data and show its effectiveness through experiments. Our results offer practical insights for training these hybrid methods.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

doi: 10.1145/3583131.3590512

2305.02949

Country:

North America > United States (1.00)
Europe (0.70)

Genre: Research Report > New Finding (0.86)

Industry:

Leisure & Entertainment > Games (0.67)
Energy > Oil & Gas > Upstream (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Arkangil, Eren, Yildirimoglu, Mehmet, Kim, Jiwon, Prato, Carlo

A deep learning framework to generate realistic population and mobility data

arXiv.org Artificial Intelligence, Nov-14-2022

Census and Household Travel Survey datasets are regularly collected from households and individuals and provide information on their daily travel behavior together with demographic and economic characteristics. These datasets have important applications ranging from travel demand estimation to agent-based modeling. However, they often represent only a limited sample of the population due to privacy concerns, or are released only in aggregated form. Synthetic data augmentation is a promising avenue for addressing these challenges. In this paper, we propose a framework to generate a synthetic population that includes both socioeconomic features (e.g., age, sex, industry) and trip chains (i.e., activity locations). Our model is tested and compared with other recently proposed models on multiple assessment metrics.

artificial intelligence, deep learning, machine learning, (17 more...)

2211.07369

Country:

Oceania > Australia > Queensland (0.04)
Europe > France > Île-de-France (0.04)

Genre: Research Report (1.00)

Industry: Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)